Markup of a Test Suite with SGML

Author

  • Martin Volk
Abstract

Recently, there have been various attempts to set up a test suite covering the syntactic phenomena of a natural language (cp. [Flickinger et al. 1989], [Nerbonne et al. 1993]). The latest effort is the TSNLP project (Test Suite for Natural Language Processing) within the Linguistic Research and Engineering (LRE) framework sponsored by the European Union (cp. [Balkan et al. 1994]). These test suites are meant for testing NLP software with regard to its coverage of syntactic phenomena. [Volk 1995] showed that a well-organised test suite can also be used to support incremental grammar development and grammar documentation. The key issues in the organisation of a test suite are the ease of extensibility and interchangeability as well as the avoidance of redundancy. We have implemented a test suite that is optimized for the avoidance of redundancy, and we report on the trade-off for extensibility and interchangeability. We define a test suite as a collection of syntactically well-formed natural language sentences and contrastively selected ungrammatical sentences (termed: non-sentences). The non-sentences can be used to check for overgeneration of the grammar. The collection is built according to the following principles:


Similar resources

TeX and SGML: A Recipe for Disaster?

The relationship between TeX and SGML (Standard Generalised Markup Language, ISO 8879) has often been uneasy, with adherents to one system or the other displaying symptoms reminiscent of the religious wars popular between devotees of TeX and of word-processors. SGML and TeX can in fact coexist successfully, provided features of one system are not expected of the other. This paper presents a pilot...


The Association of American Publishers (AAP) Reference Manual on Electronic Manuscript Preparation and Markup Defines a Document As: Lifecycle-phases of Documents

What SGML (and TeX) is all about is given in a nutshell. Markup of example document elements, by SGML and LaTeX, is provided. Coupling SGML to TeX is considered by direct translation and by the intermediate procedural markup phase. Interfacing SGML to (La)TeX is also addressed. Some guidelines are provided in order to decide when SGML, or TeX (alone, both, or neither) might be beneficial. It i...


Why Use SGML?

The Standard Generalised Markup Language (SGML) is a recently-adopted International Standard (ISO 8879), the first of a series of proposed Standards in the area of Information Processing — Text and Office Systems. The paper presents some background material on markup systems, gives a brief account of SGML, and attempts to clarify the precise nature and purpose of SGML, which are widely misunder...


What Should Markup Really Be? Applying theories of text to the design of markup systems

The issue of what text really is, and how it affects our notions of proper text representation has been with us almost from the beginning of text encoding [Goldfarb 1981, Reid 1980, Coombs, et al. 1987, DeRose, et al. 1990, Renear et al.]. The simplest reasonable view, that text is fundamentally an ordered hierarchical structure, determined by its editor and author, is an early one...


SGML-Lite - An SGML-based Programming Environment

Literate Programming is a documentation method that attempts to maintain consistency among the various design and program documents of a software system. Unfortunately the majority of the literate programming tools do not have appropriate user interfaces and require the users to learn complicated and cryptic tagging languages. SGML is a metalanguage used to specify markup or tagging languages t...




Publication date: 1996